Moving Beyond the Mind's Eye Through AI

16/08/24 08:00 Filed in: AI for the Creative Mind

Introduction

For the past two years, I have immersed myself in all things AI, from its core technology to various LLMs, API integration, prompt engineering, and custom GPTs, as well as AI tools for media production. I've been experimenting with as many tools as possible and continue to have "holy sh@#" moments just about every day. To better understand what's possible, I embarked on a new challenge project. I'd like to share a few things I've discovered, along with the ideas I’ve been developing about process and AI production workflows.

A long-time buddy of mine, a creative storyteller, has written several books in the urban fantasy genre. I would characterize this guy, and authors in general, as Creative Minds. Traditionally, storytellers start by capturing and expressing their ideas through the written word, describing characters and scenes in vivid detail so readers can visualize them. However, we are now at a crossroads where AI tools can help creative minds express stories not just in written format but across all media. This new era requires a multimodal mindset and a combinatorial approach, using multiple tools to bring ideas to life.

To test this, I proposed the idea of creating a 1-minute video trailer for his first book entirely through AI. My buddy dug the idea and was kind enough to lend his content for this experiment.

Here’s the workflow I decided to use to tackle this project:

Project Workflow Plan

Step 1: Create a Custom GPT
I started with the book’s text and ChatGPT to create a custom GPT using the manuscript as the core knowledge base. This allowed me to interact with an assistant who had expert knowledge of the content, enabling me to ask detailed questions 24/7 about the story, characters, and their relationships. It’s like having a conversation with a friend who is an expert on the book.

Step 2: Generating Story Summary and Trailer Script
Next, I used the custom GPT to create a summary of the book's story. From there, I researched the characters, then used the GPT to generate multiple ideas and trailer script options. I iterated back and forth with my new GPT expert content assistant to finalize the VO script, and finally used the GPT to develop storyboards for the video trailer in preparation for the media creation.

Step 3: Generating Visual Content
Next, I developed prompts based on character descriptions from the GPT and used them in Midjourney and DALL-E to create visual representations of the characters and environments. This process took many iterations to fine-tune the images, which I further refined using Photoshop’s AI tools to better fit the visual needs of the project and prep for the next steps.
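
For anyone who wants to see what "prompts based on character descriptions" might look like in practice, here is a minimal Python sketch. Everything in it is an assumption for illustration: the field names (`name`, `appearance`, `setting`, `style`) and the sample character are hypothetical, not taken from the book or from any image tool's API. The output is just a text string you would paste into an image generator.

```python
from dataclasses import dataclass


@dataclass
class CharacterBrief:
    """Hypothetical fields summarizing what the GPT said about a character."""
    name: str          # kept only for tracking; not sent to the image tool
    appearance: str
    setting: str
    style: str = "cinematic urban fantasy, moody lighting"


def build_image_prompt(brief: CharacterBrief) -> str:
    """Join the non-empty descriptive fields into one comma-separated prompt."""
    parts = [brief.appearance, brief.setting, brief.style]
    return ", ".join(p.strip() for p in parts if p.strip())


# Placeholder character, invented for illustration only.
brief = CharacterBrief(
    name="Example Hero",
    appearance="tall woman in a worn leather coat",
    setting="rain-soaked city alley at night",
)
prompt = build_image_prompt(brief)
print(prompt)
```

Keeping the style text in one place makes it easier to hold a consistent look across characters while you iterate on the individual descriptions.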

Step 4: Creating Voice-Overs
For character and narrator voices, I worked with my GPT assistant to craft prompts that would generate AI voices with the right inflections and characteristics. Using various AI voice tools (WellSaid Labs, ElevenLabs), I brought these characters to life with high-quality AI-generated voice-overs.

Step 5: Composing the Soundtrack
To match the cinematic feel of the book, I used an AI music generation tool (Udio), guided by prompts describing the envisioned soundtrack. This step also required several iterations, but the outcome was impressive and captured the essence of the book’s tone.

Step 6: Building a Sound Effects Track
To enhance the overall experience, I explored several AI tools for generating sound effects. Although I didn't find any that fully met my needs, I managed to incorporate a few successful additions into the project. This step highlighted the experimental nature of working with AI tools and the importance of persistence.

Step 7: Animating the Characters
To bring the characters to life with motion, I used the images created and prepared in the earlier image steps as source, or “seed,” inputs for the motion stage. I explored both 3D model and video AI tools, and ended up using Luma and other video tools to animate the characters within the scenes, adding movement to the static visuals.

Step 8: Final Assembly
With all these components in place, I used a video editing tool to combine the visuals, voice-over narration, music, and sound effects into a cohesive video. Although I couldn't find a specific AI tool that could integrate all these media assets into one final product, using a digital video editor allowed me to assemble the project effectively. This process underscored the need for a combinatorial approach, utilizing multiple AI tools to handle different aspects of production.
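
Since the assets come from so many different tools, it can help to keep a simple list of what goes on which track before opening the editor. The sketch below is one hypothetical way to do that in Python and to check each track against the 1-minute target. The track labels, clip durations, and helper names are made up for illustration; they are not measurements from this project or features of any editing tool.

```python
from dataclasses import dataclass

TRAILER_SECONDS = 60.0  # the 1-minute target for the trailer


@dataclass
class Clip:
    track: str       # "video", "voice", "music", or "sfx" (hypothetical labels)
    source: str      # which tool produced the asset; free text
    start: float     # seconds from the start of the trailer
    duration: float  # seconds


def track_end_times(clips):
    """Return the latest end time (start + duration) for each track."""
    ends = {}
    for clip in clips:
        end = clip.start + clip.duration
        ends[clip.track] = max(ends.get(clip.track, 0.0), end)
    return ends


def overruns(clips, limit=TRAILER_SECONDS):
    """List the tracks whose last clip ends after the time limit."""
    return sorted(t for t, end in track_end_times(clips).items() if end > limit)


# Placeholder clips with invented durations.
clips = [
    Clip("video", "Luma", 0.0, 30.0),
    Clip("video", "Luma", 30.0, 30.0),
    Clip("voice", "ElevenLabs", 2.0, 50.0),
    Clip("music", "Udio", 0.0, 65.0),
]
print(track_end_times(clips))
print(overruns(clips))
```

In this made-up example the music track runs past the limit, which is the kind of thing you would then trim or fade in the video editor.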

Reflection on AI Tools Today

One thing I've noticed is that most AI tools today offer only coarse adjustments, like a big honkin’ knob that jumps from 1 to 5 to 10, allowing for only broad changes. However, I can see a future where we'll have fine-tuned sliders that provide more precise control over every content parameter. This advancement will enable even greater accuracy and refinement in AI-generated content. Multimodal tools like LTXStudio are coming soon, and it will be fun to see how they evolve.

This project is still a work in progress, but it demonstrates how AI can support and enhance the creative mind. The technology is rapidly evolving, and while current tools require many iterations, some of them hit-or-miss, seamless, integrated AI solutions are on the horizon. The included video shows some of the content I’ve been able to create thus far. These assets may or may not be used in the final trailer, but they are meant to show the challenges and current output of my experimentation and creative pursuits.

Stay tuned for future articles, where I’ll share the final product and more insights from this exciting journey. Embracing AI is not just about staying relevant; it’s about pushing the boundaries of what’s possible in creative expression.

Onward… forward!

Check out the first iteration of the trailer. For this version, I used the custom GPT to generate character descriptions and prompts, then used those to create images of a few of the characters, along with voices and a music track.

#AI #ArtificialIntelligence #Creativity #AIArt #TechInnovation